import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/rromanss23/Machine_Leaning_Engineer_Udacity_NanoDegree/master/projects/boston_housing/housing.csv')
# !pip install pandas_profiling
# Import the module
from pandas_profiling import ProfileReport
# Generate the report
profile = ProfileReport(df, title='Boston house pricing')
# Show the report
# profile.to_widgets()
profile.to_file("Pandas_Profile.html")
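The `pandas_profiling` package has since been renamed to `ydata_profiling`, so the import above may fail on a fresh install. Below is a minimal sketch of a fallback import; the helper name `load_profile_report_class` is hypothetical and not part of either library.

```python
import importlib


def load_profile_report_class(module_names=("ydata_profiling", "pandas_profiling")):
    """Return the first ProfileReport class found, or None if none can be imported.

    Hypothetical helper: tries each module name in order.
    """
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Usually ImportError; a broken or outdated install can also fail
            # with other errors at import time, so we skip it and try the next.
            continue
        report_class = getattr(module, "ProfileReport", None)
        if report_class is not None:
            return report_class
    return None


ProfileReport = load_profile_report_class()
if ProfileReport is None:
    print("Neither ydata_profiling nor pandas_profiling could be imported.")
```

If a class is returned, it can be used the same way as in the cell above: `ProfileReport(df, title='Boston house pricing')`.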
# !pip install sweetviz
# Import the module
import sweetviz as sv
# Generate the report
my_report = sv.analyze(df)
# The report can be exported to HTML or previewed in the notebook:
my_report.show_html() # Export to HTML
# my_report.show_notebook() # Preview in the notebook
Report SWEETVIZ_REPORT.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files.
#! pip install dataprep
from dataprep.eda import create_report
report = create_report(df, title = 'Boston house pricing')
/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/dask/core.py:119: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))
report.save('dataprep_report')
Report has been saved to dataprep_report.html!
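The `RuntimeWarning: invalid value encountered in divide` in the output above is the message NumPy emits when a floating-point division produces an undefined result such as `0 / 0`. Where exactly it arises inside dataprep is not shown in the output, so the sketch below only reproduces the same kind of warning on toy values and shows one way to silence it locally. The function name `safe_ratio` is hypothetical.

```python
import numpy as np


def safe_ratio(numerator, denominator):
    """Element-wise division that silences NumPy's divide/invalid warnings.

    Hypothetical helper: 0/0 yields NaN and x/0 yields +/-inf,
    without emitting a RuntimeWarning.
    """
    numerator = np.asarray(numerator, dtype=float)
    denominator = np.asarray(denominator, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return numerator / denominator


# Toy values: the first element is 0/0, which would normally trigger the warning.
ratios = safe_ratio([0.0, 1.0], [0.0, 2.0])
print(ratios)
```

The report is still generated despite the warning, as the "Report has been saved" message confirms.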
#!pip install autoviz
from autoviz.AutoViz_Class import AutoViz_Class
AV = AutoViz_Class()
df = AV.AutoViz('https://raw.githubusercontent.com/rromanss23/Machine_Leaning_Engineer_Udacity_NanoDegree/master/projects/boston_housing/housing.csv')
Shape of your Data Set loaded: (489, 4)
#######################################################################################
######################## C L A S S I F Y I N G  V A R I A B L E S  ####################
#######################################################################################
Classifying variables in data set...
Data cleaning improvement suggestions. Complete them before proceeding to ML modeling.
| | Nuniques | dtype | Nulls | Nullpercent | NuniquePercent | Value counts Min | Data cleaning improvement suggestions |
|---|---|---|---|---|---|---|---|
| LSTAT | 442 | float64 | 0 | 0.000000 | 90.388548 | 0 | |
| RM | 430 | float64 | 0 | 0.000000 | 87.934560 | 0 | |
| MEDV | 228 | float64 | 0 | 0.000000 | 46.625767 | 0 | |
| PTRATIO | 44 | float64 | 0 | 0.000000 | 8.997955 | 0 | |
4 Predictors classified...
No variables removed since no ID or low-information variables found in data set
Number of All Scatter Plots = 10
No categorical or numeric vars in data set. Hence no bar charts.
All Plots done
Time to run AutoViz = 2 seconds
###################### AUTO VISUALIZATION Completed ########################
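The per-column summary printed by AutoViz can be approximated with plain pandas. The values above are consistent with `NuniquePercent` being the number of unique values divided by the number of rows (for LSTAT, 442 / 489 ≈ 90.39%). The sketch below is an assumption about how those columns can be derived, not AutoViz's own implementation; `summarize_columns` and the toy DataFrame are hypothetical.

```python
import pandas as pd


def summarize_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Per-column unique counts, dtypes, and null statistics (hypothetical helper)."""
    n_rows = len(frame)
    summary = pd.DataFrame({
        "Nuniques": frame.nunique(),          # distinct non-null values per column
        "dtype": frame.dtypes.astype(str),
        "Nulls": frame.isna().sum(),          # missing values per column
    })
    summary["Nullpercent"] = summary["Nulls"] / n_rows * 100
    summary["NuniquePercent"] = summary["Nuniques"] / n_rows * 100
    return summary.sort_values("Nuniques", ascending=False)


# Toy data for illustration only; not taken from the housing dataset.
toy = pd.DataFrame({"a": [1.0, 2.0, 2.0, 4.0], "b": [0.5, 0.5, 0.5, None]})
print(summarize_columns(toy))
```

Calling `summarize_columns` on the housing DataFrame loaded at the top of the notebook should give a comparable table, without the cleaning-suggestion column.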
#!pip install jupyter_contrib_nbextensions
#!pip install -U nbconvert==5.6.1